Robust Text Classification under Confounding Shift
Similar resources
Robust Text Classification in the Presence of Confounding Bias
As text classifiers become increasingly used in real-time applications, it is critical to consider not only their accuracy but also their robustness to changes in the data distribution. In this paper, we consider the case where there is a confounding variable Z that influences both the text features X and the class variable Y. For example, a classifier trained to predict the health status of a...
Social Media Text Classification under Negative Covariate Shift
In a typical social media content analysis task, the user is interested in analyzing posts of a particular topic. Identifying such posts is often formulated as a classification problem. However, this problem is challenging. One key issue is covariate shift. That is, the training data is not fully representative of the test data. We observed that the covariate shift mainly occurs in the negative...
Robust Supervised Learning under Distribution Shift Uncertainty
Distributionally Robust Supervised Learning (DRSL) is necessary for building reliable machine learning systems. When machine learning is deployed in the real world, its performance can be significantly degraded because test data may follow a different distribution from training data. Previous DRSL minimizes the loss for the worst-case test distribution. However, our theoretical analyses show th...
A Robust Learning Approach for Text Classification
Previous learning approaches often assume that every part of a positive training document of a class is relevant to that class. However, in practice, it is often the case that only one or a few parts in the training document are really relevant to the class. To overcome this limitation, we propose another learning approach based on relevance-based topic model, an extension of well-known Latent ...
A Robust Model for Intelligent Text Classification
Methods for taking into account linguistic content into text retrieval are receiving a growing attention [16],[14]. Text categorization is an interesting area for evaluating and quantifying the impact of linguistic information. Works in text retrieval through Internet suggest that embedding linguistic information at a suitable level within traditional quantitative approaches (e.g. sense distinc...
Journal
Journal title: Journal of Artificial Intelligence Research
Year: 2018
ISSN: 1076-9757
DOI: 10.1613/jair.1.11248